Abstract. There are already quite a few tools for solving Satisfiability
Modulo Theories (SMT) problems. In this paper, we present VolCE,
a tool for counting the solutions of SMT constraints, or in other words,
for computing the volume of the solution space. Its input is essentially
a set of Boolean combinations of linear constraints, where the numeric
variables are either all integers or all reals, and each variable is bounded.
The tool extends SMT solving with integer solution counting and volume
computation/estimation for convex polytopes. Effective heuristics
are adopted, which enable the tool to deal with high-dimensional problem
instances efficiently and accurately.


1 Introduction 

In recent years, there has been a lot of work on solving the Satisfiability Modulo
Theories (SMT) problem. Quite efficient SMT solvers have been developed, such
as CVC3, MathSAT, Yices and Z3. In [13], we studied the counting version
of SMT solving, and presented some techniques for computing the size of the
solution space efficiently. This problem can be regarded as an extension to SMT
solving, and also an extension to model counting in propositional logic. It has
recently gained much attention in the software engineering community.

The prototype tool presented in [13] computes the exact volume of the solution
space. However, exact volume computation in general is an extremely difficult
problem. It has been proved to be #P-hard, even for explicitly described
polytopes. On the other hand, it suffices to have an approximate value of the volume
in many cases. Recently we implemented a tool to estimate the volume of
polytopes, and integrated it into the framework of [13].

This paper presents the new tool VolCE for the counting version of SMT(LA).
(Here LA stands for linear arithmetic.) The input of the tool is a set of Boolean
combinations of linear constraints, where each numeric variable is bounded.
Independent Boolean variables may also appear in the constraints. The output of
the tool is the “volume” of the solution space, or the number of solutions in the
case that the domain consists of integer points.

^ It is available at http://lcs.ios.ac.cn/'zj/voice 10x64.tar.gz 
The rest of this paper is organized as follows. Section 2 presents the
architecture of VolCE. Section 3 presents the algorithm and implementation of our
volume estimation sub-procedure. Section 4 briefly describes how to use the tool.
Section 5 discusses our experiments, and Section 6 describes some related works.
We conclude in Section 7.

2 Architecture of VolCE 

The architecture of VolCE is illustrated in Figure 1. Recall that in [13], a
consistent conjunction of linear constraints that satisfies the Boolean skeleton of the
SMT(LA) formula is called a feasible assignment. The sum of the volumes of all
feasible assignments is the volume of the whole formula. In VolCE, the SAT solving
engine (MiniSat) and the linear arithmetic solver (lp_solve) work together
to find feasible assignments. Each time a feasible assignment is obtained, VolCE
tries to reduce it to a partial assignment that still propositionally satisfies the
formula. The resulting feasible partial assignment may cover a bunch of
feasible assignments, hence it is called a “bunch”. Then a solution counting or volume
computation/estimation sub-procedure is called for the polytope corresponding
to each bunch rather than each feasible assignment, so that the number of calls
is reduced. For more details of the main algorithm, see [13].
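
The identity behind this architecture, namely that the solution counts (or volumes) of the feasible assignments add up to the count for the whole formula, can be checked on a toy instance. The following Python sketch is only an illustration: the two constraints, the skeleton and the integer domain are hypothetical, and the brute-force enumeration does not reflect how VolCE, MiniSat or lp_solve are actually invoked.

```python
from itertools import product

# Two hypothetical linear constraints over integer points (x, y) in [0, 5]^2.
def b1(x, y):
    return x + y <= 4

def b2(x, y):
    return x >= 2

LACS = [b1, b2]

def skeleton(v1, v2):
    """Boolean skeleton of the toy formula: b1 OR b2."""
    return v1 or v2

POINTS = list(product(range(6), repeat=2))

def count_by_assignments():
    """Sum the solution counts of all assignments that satisfy the skeleton."""
    total = 0
    for assignment in product([True, False], repeat=len(LACS)):
        if not skeleton(*assignment):
            continue  # the assignment falsifies the Boolean skeleton
        # Points on which every constraint takes exactly the assigned value.
        # An inconsistent (infeasible) assignment simply contributes 0.
        total += sum(
            1 for p in POINTS
            if all(lac(*p) == value for lac, value in zip(LACS, assignment))
        )
    return total

def count_directly():
    """Brute-force count of the points that satisfy the whole formula."""
    return sum(1 for p in POINTS if skeleton(*(lac(*p) for lac in LACS)))

print(count_by_assignments(), count_directly())  # 33 33
```

Because two distinct total assignments can never hold at the same point, their solution sets are disjoint and the per-assignment counts can simply be added.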


Fig. 1. The Architecture of VolCE


In addition to MiniSat and lp_solve, VolCE calls Vinci [3] and LattE 
to help compute the size of the solution space. Vinci is a tool for computing 
the volume of a convex body. LattE is a tool for counting lattice points inside 
convex polytopes and solutions of integer programs. Moreover, we implemented 
a tool for estimating the volume of convex polytopes, called PolyVest [8]. It will 
be elaborated in the subsequent section. 

^ The MiniSat Page, http://minisat.se/ 

^ Available at http://lpsolve.sourceforge.net/ 
3 Volume Estimation

The performance of volume computation packages for convex polytopes is the
bottleneck of the prototype tool in [13]. Recently we augmented it with an
efficient volume estimation sub-procedure for convex polytopes.


3.1 Volume Estimation for Convex Polytopes 

A straightforward way to estimate the volume of a convex body is the
Monte-Carlo method. However, it suffers from the curse of dimensionality, which means
that the probability of a random sample falling inside the target object decreases
very quickly as the dimension increases. As a result, the sample size has to
grow exponentially to achieve a reasonable estimation. To avoid the curse of
dimensionality, Dyer et al. proposed a polynomial time randomized approximation
algorithm (Multiphase Monte-Carlo Algorithm) [5]. The theoretical complexity
of the original algorithm is $O^*(n^{23})$, and it has recently been reduced to $O^*(n^4)$ [11].
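
To make the curse of dimensionality concrete, the following Python sketch (an illustration only, not part of VolCE or PolyVest) computes the probability that a uniform sample from the cube $[-1, 1]^n$ lands inside the inscribed unit ball. It relies on the standard closed form $\pi^{n/2} / \Gamma(n/2 + 1)$ for the volume of the unit ball; the function names are hypothetical.

```python
import math

def unit_ball_volume(n: int) -> float:
    """Volume of the n-dimensional unit ball: pi^(n/2) / Gamma(n/2 + 1)."""
    return math.pi ** (n / 2) / math.gamma(n / 2 + 1)

def hit_probability(n: int) -> float:
    """Probability that a uniform point of the cube [-1, 1]^n lies in the unit ball."""
    return unit_ball_volume(n) / 2.0 ** n

for n in (2, 5, 10, 20):
    print(f"n = {n:2d}: hit probability = {hit_probability(n):.3e}")
```

The hit probability is about 0.785 for n = 2 but already below 0.003 for n = 10, so a naive sampler would waste almost all of its points in higher dimensions.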

Based on the Multiphase Monte-Carlo method, we implemented our own tool
PolyVest (Polytope Volume Estimation) to estimate the volume of convex
polytopes [8]. One improvement over the original Multiphase Monte-Carlo method
is that we developed a new technique to reutilize sample points, so that the
number of sample points can be significantly reduced.

In the sequel, we briefly describe the algorithm implemented in PolyVest.
For more details, one can refer to [8]. We assume that $P$ is a full-dimensional and
nonempty convex polytope. We use $\mathrm{vol}(K)$ to represent the volume of a convex
body $K$, and $B(x, R)$ to represent the ball with radius $R$ and center $x$.




Fig. 3. Hit-and-run 


The basic procedure of PolyVest consists of the following three steps: rounding,
subdivision and sampling.

http://en.wikipedia.org/wiki/Curse_of_dimensionality
The “soft-O” notation $O^*$ indicates that we suppress factors of $\log n$ as well as factors
depending on other parameters like the error bound.
Rounding. First we find an affine transformation $T$ on the polytope $P$ so that
$T(P)$ contains the unit ball $B(0, 1)$ and is contained in the ball $B(0, r)$. This
can be achieved by applying the Shallow-$\beta$-Cut Ellipsoid Method. We set
$r = 2n$ in our implementation. Rounding is an essential step. For example, it is
difficult to subdivide a very “thin” polytope and sample on it without rounding.
For simplicity, we still use $P$ to denote the new polytope $T(P)$ after rounding.

Subdivision. Then we divide $P$ into a sequence of convex bodies. The general
idea of the subdivision step is illustrated in Figure 2. We place $l$ concentric balls
$\{B_i\}$ between $B(0, 1)$ and $B(0, r)$. Let $K_i$ denote the convex body $B_i \cap P$, then

$$\mathrm{vol}(P) = \mathrm{vol}(K_0) \prod_{i=0}^{l-1} \frac{\mathrm{vol}(K_{i+1})}{\mathrm{vol}(K_i)}.$$

Let $\alpha_i$ denote the ratio $\mathrm{vol}(K_{i+1}) / \mathrm{vol}(K_i)$, then

$$\mathrm{vol}(P) = \mathrm{vol}(K_0) \prod_{i=0}^{l-1} \alpha_i.$$

Hence the volume of the polytope $P$ is transformed into the product of the ratios and
the volume of $K_0$. Note that $K_0 = B(0, 1)$, whose volume can be easily computed.
So we only have to estimate the values of the $\alpha_i$.

Of course, one would like to choose the number of concentric balls, $l$, to be
small. However, one needs about $O(l\alpha_i)$ random points to get a sufficiently good
approximation for $\alpha_i$. It follows that the $\alpha_i$ must not be too large. In PolyVest,
we set $l = \lceil n \log_2 r \rceil$ and $B_i = B(0, 2^{i/n})$ to construct the convex bodies $\{K_i\}$.
It can be proved that $1 \le \alpha_i \le 2$ with this construction.
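
As a small worked example of this construction, the Python sketch below computes $l$ and the radii $2^{i/n}$ of the concentric balls for a given dimension $n$ and outer radius $r$. The function name is hypothetical and the snippet is only meant to illustrate the formulas above.

```python
import math
from typing import List

def subdivision_radii(n: int, r: float) -> List[float]:
    """Radii of the balls B_0, ..., B_l with B_i = B(0, 2^(i/n)) and l = ceil(n * log2(r))."""
    l = math.ceil(n * math.log2(r))
    return [2.0 ** (i / n) for i in range(l + 1)]

radii = subdivision_radii(n=3, r=6.0)  # r = 2n, as in the rounding step
print(len(radii) - 1, radii[0], radii[-1])
```

For n = 3 and r = 6 this gives l = 8, with the innermost radius 1 and the outermost radius 2^(8/3), which is at least r. Consecutive radii differ by the factor 2^(1/n), so the volumes of consecutive balls differ by exactly a factor of 2, which is why each ratio is bounded by 2.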

Sampling. Finally, we generate $S$ points in $K_{i+1}$ and count the number of
points $c_i$ in $K_i$. Thus $\alpha_i$ can be approximated with $S/c_i$. Generating independent
uniformly distributed random points in $\{K_i\}$ is not as simple as in cubes or
ellipsoids, so we use a hit-and-run method for sampling. The hit-and-run method is
a random walk which can generate points with almost uniform distribution in
polynomial time [1]. Figure 3 illustrates the hit-and-run method: it starts from
a point $x$, then randomly selects a line $L$ through $x$ and chooses the next point
$x'$ uniformly on the segment of $L$ inside $P$. PolyVest adopts the coordinate
directions hit-and-run method, in which the random direction of line $L$ is chosen
with equal probability from the coordinate direction vectors and their negations.
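
A single step of the coordinate-direction walk can be sketched as follows. This is a minimal Python illustration, not PolyVest's code: it assumes that the body is given as $\{y : Ay \le b\} \cap B(0, R)$ and that the current point already lies inside it; the function name and the matrix representation are assumptions. Since the next point is uniform on the whole chord, choosing a coordinate axis is equivalent to choosing between a direction vector and its negation.

```python
import math
import random
from typing import List, Sequence

def hit_and_run_step(x: List[float], A: Sequence[Sequence[float]],
                     b: Sequence[float], R: float,
                     rng: random.Random) -> List[float]:
    """One coordinate-direction hit-and-run step inside {y : A y <= b} and B(0, R).

    The point x must already lie inside the body.
    """
    k = rng.randrange(len(x))  # coordinate axis of the random line
    # Chord of the ball along axis k: (x_k + t)^2 + sum_{j != k} x_j^2 <= R^2.
    rest = sum(v * v for j, v in enumerate(x) if j != k)
    half = math.sqrt(max(R * R - rest, 0.0))
    lo, hi = -half - x[k], half - x[k]
    # Clip the chord with every half-space row . y <= b_i.
    for row, bi in zip(A, b):
        slack = bi - sum(a * v for a, v in zip(row, x))
        if row[k] > 0:
            hi = min(hi, slack / row[k])
        elif row[k] < 0:
            lo = max(lo, slack / row[k])
    y = list(x)
    y[k] += rng.uniform(lo, hi)  # uniform point on the clipped chord
    return y

# Usage: walk inside the square [-1, 1]^2 intersected with the ball B(0, 1.2).
A = [[1, 0], [-1, 0], [0, 1], [0, -1]]
b = [1, 1, 1, 1]
rng = random.Random(0)
point = [0.0, 0.0]
for _ in range(1000):
    point = hit_and_run_step(point, A, b, 1.2, rng)
```

Every intermediate point stays inside the body, because the step length is drawn only from the part of the line that satisfies all constraints.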

Reutilization of Sample Points. In the original Multiphase Monte Carlo
method, the ratios $\alpha_i$ are estimated in natural order, from the first ratio $\alpha_0$
to the last one $\alpha_{l-1}$, and the method starts sampling from the origin. However,
our implementation works in the opposite way. It generates sample points from
the outermost convex body $K_l$ to the innermost convex body $K_0$, and the ratios
are estimated accordingly in reverse order.

The advantage of approximation in reverse order is that it is possible to fully
exploit the sample points generated in the previous phases. Since
$K_0 \subseteq K_1 \subseteq \cdots \subseteq K_l$, the sample points in $K_i$ still fall in $K_j$ ($i < j$). On the other hand,
the sample points generated by the hit-and-run method are almost independent
when the sample size $S$ is large enough. Therefore, for any $i < j$, the points
generated for approximating $\alpha_j$ that hit $K_{i+1}$ can serve as sample points to
approximate $\alpha_i$ as well. It can be easily proved that we only need to generate
less than half of the sample points with this technique, since $\alpha_i \le 2$. In practice, this
technique saves over 70% of the running time under most circumstances.
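
The reverse-order scheme can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not PolyVest's implementation: fresh points are drawn by rejection sampling from a bounding box instead of by hit-and-run, the polytope is supplied as a membership test, and all function and parameter names are hypothetical.

```python
import math
import random
from typing import Callable, List, Tuple

Point = List[float]

def norm(p: Point) -> float:
    return math.sqrt(sum(v * v for v in p))

def estimate_volume_reverse(n: int, r: float, S: int,
                            in_polytope: Callable[[Point], bool],
                            sample_box: Callable[[], Point]) -> Tuple[float, int]:
    """Estimate vol(P) for B(0,1) inside P inside B(0,r), reusing sample points.

    Returns (volume estimate, number of freshly generated points).
    """
    l = math.ceil(n * math.log2(r))
    radius = [2.0 ** (i / n) for i in range(l + 1)]

    def fresh_point(i: int) -> Point:
        # Rejection sampling stands in for hit-and-run in this toy sketch.
        while True:
            p = sample_box()
            if in_polytope(p) and norm(p) <= radius[i]:
                return p

    volume = math.pi ** (n / 2) / math.gamma(n / 2 + 1)  # vol(K_0) = vol(B(0,1))
    pool: List[Point] = []  # points already known to lie in K_{i+1}
    fresh = 0
    for i in range(l - 1, -1, -1):  # phases l-1, ..., 0
        while len(pool) < S:  # top up to S points in K_{i+1}
            pool.append(fresh_point(i + 1))
            fresh += 1
        pool = [p for p in pool if norm(p) <= radius[i]]  # points that hit K_i
        volume *= S / len(pool)  # alpha_i is approximated by S / c_i
    return volume, fresh

# Usage: the cube [-1, 1]^3 contains B(0, 1) and lies inside B(0, sqrt(3)).
rng = random.Random(1)
n = 3
vol, fresh = estimate_volume_reverse(
    n, math.sqrt(n), 4000,
    in_polytope=lambda p: all(abs(v) <= 1.0 for v in p),
    sample_box=lambda: [rng.uniform(-1.0, 1.0) for _ in range(n)],
)
print(vol, fresh)
```

The points that survive the filter in one phase are uniform in the next smaller body, so each phase only has to generate the points needed to bring the pool back to size S. In this example the exact volume is 8 and fewer than l times S fresh points are drawn.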


3.2 Volume Estimation for SMT(LRA) Formulas

Now we describe how to estimate the “volume” of the solution space of SMT(LRA)
formulas. (Here LRA stands for linear real arithmetic.) The basic procedure is
quite similar to that of volume computation as described in [13]. As in [13], a
consistent conjunction of linear constraints that satisfies the Boolean skeleton
of the SMT(LRA) formula is called a feasible (partial) assignment. Each time
VolCE obtains a feasible (partial) assignment, it calls PolyVest to estimate the
volume of the polytope corresponding to this assignment. The sum of the estimated
volumes of all feasible (partial) assignments is approximately the volume of the
whole formula. Note that the “volume computation in bunches” strategy in [13]
can also be applied in volume estimation.

In the Multiphase Monte-Carlo method, the number of sample points at each 
phase is a key parameter. As the sample size increases, the accuracy of estimation 
improves, and the estimation process also takes more time. It is important to 
balance the accuracy and run time, especially for VolCE since the estimation 
subroutine PolyVest is usually called many times. 

VolCE employs a two-round strategy that can dynamically determine a proper
sample size for each feasible (partial) assignment. In the first round of
estimation, each feasible assignment is sampled with a fixed small number of random
points to get a quick and rough estimation. Since the volumes of feasible
assignments may vary a lot, intuitively a feasible assignment with relatively larger
volume should be estimated with higher accuracy. Hence in the second round,
the sample size for each assignment is determined according to its estimated
volume from the first round. More specifically, we use the following rule to decide
the sample sizes in the second round:

- Suppose the sample size in the first round is $S_{min}$, and the largest sample
size in the second round is set to $S_{max}$. Let $V_{max}$ denote the largest
estimated volume in the first round, and $V_i$ denote the volume of the $i$-th feasible
assignment estimated in the first round. Then the sample size $S_i$ for the $i$-th
feasible assignment in the second round is:

$$S_i = \frac{S_{max} \times V_i}{V_{max}}.$$

If $S_i < S_{min}$, the $i$-th feasible assignment is neglected in the second round,
and we use the result from the first round as its estimated volume. If $S_i >
S_{max}$, then $S_i$ is set to $S_{max}$.
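
The rule can be written down directly. The Python sketch below is an illustration, not VolCE's code: the function name is hypothetical, the sizes are rounded to integers (an assumption, since the rounding is not specified here), and at least one first-round estimate is assumed to be positive.

```python
from typing import List, Optional

def second_round_sizes(first_round_volumes: List[float],
                       s_min: int, s_max: int) -> List[Optional[int]]:
    """Second-round sample size S_i = S_max * V_i / V_max for each assignment.

    None means that the assignment is skipped in the second round and keeps
    its first-round estimate.
    """
    v_max = max(first_round_volumes)
    sizes: List[Optional[int]] = []
    for v in first_round_volumes:
        s = s_max * v / v_max
        if s < s_min:
            sizes.append(None)  # too small: reuse the first-round result
        else:
            sizes.append(min(int(round(s)), s_max))  # never exceed S_max
    return sizes

print(second_round_sizes([100.0, 50.0, 1.0], s_min=40, s_max=1600))
# [1600, 800, None]
```

The assignment with the largest estimated volume receives the full budget, while assignments whose volume is less than $S_{min}/S_{max}$ of the largest one are not sampled again.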
Through statistical results of substantial experiments, we find that setting $S_{min}$
to $40l$ and $S_{max}$ to $1600l$ ($l = \lceil n \log_2 r \rceil$) is very effective. With this strategy,
only $S_{min}/S_{max} = 1/40$ of the points need to be generated in extreme cases. In practice,
it usually saves more than 95% of the points for random instances.

4 Using the Tool 

The input of VolCE is an SMT formula where the theory T is restricted to the 
linear arithmetic theory. It can be regarded as a Boolean combination of linear 
inequalities. There are two input formats. 

For the first format, the input formula is a Boolean formula $\phi(b_1, \ldots, b_n)$ in
conjunctive normal form (CNF), where each Boolean variable $b_i$ can stand for a
linear arithmetic constraint (LAC). The whole input file is an extension of the
DIMACS format for SAT solving. An alternative input format is SMT-Lib style.
Currently, VolCE supports the main features of the “SMT-LIBv2” syntax.

VolCE has several command-line options:

- -V asks the tool to call Vinci to compute the volume. 

- -P asks the tool to call PolyVest to approximate the volume. 

- -L asks the tool to call LattE to count integer solutions. 

- -w=NUMBER specifies the word length of variables in bit-wise representations. 

- -maxc=NUMBER sets the maximum sampling coefficient of PolyVest, which is 
an upper bound. 

- -minc=NUMBER sets the minimum sampling coefficient of PolyVest. 

For more details about using the tool, see the manual. Here we just give an 
example to show its application to program analysis. 

Example from Program Analysis 

Suppose we want to analyze the execution frequency of a program path, Path1,
which has the following path condition:



(NOT ((c = 32) OR (c = 9) OR (c = 10))) AND 
((c != 46) AND ((c < 48) OR (c > 57))) 

Here c is a variable of type char; it can be regarded as an integer variable within 
the domain [-128..127]. For this path condition, we can compute the number of 
solutions using VolCE. With option -L (i.e., using LattE), the tool tells us that 
the constraint has 242 solutions. (We do not need to use the option -w=8, because 
the default word length is 8.) Given that the size of the whole search space is 256, 
we conclude that the frequency of executing Path1 is about 0.945 (242/256). This
means that, if the input string has only one character, the program will most
probably follow this path.
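
For such a small domain the count can be cross-checked by brute force. The Python sketch below is an independent check and does not use VolCE or LattE; it enumerates all 256 values of an 8-bit signed character and evaluates the path condition of Path1 on each of them.

```python
def path1_condition(c: int) -> bool:
    """Path condition of Path1 from the example above."""
    return (not (c == 32 or c == 9 or c == 10)) and (c != 46 and (c < 48 or c > 57))

domain = range(-128, 128)  # all values of an 8-bit signed char
solutions = sum(1 for c in domain if path1_condition(c))
frequency = solutions / len(domain)
print(solutions, round(frequency, 3))  # 242 0.945
```

The condition excludes the three characters 32, 9 and 10, the character 46 and the ten characters 48 to 57, which leaves 256 - 14 = 242 solutions.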


The path condition is a set of constraints such that any input data satisfying these 
constraints will make the program execute along that path. 
Another path, Path2, has a more complicated path condition (omitted here,
due to the lack of space). It involves 3 input variables c0, c1 and c2. Given the
second path condition, our tool tells us that the number of solutions is 8085. So
the path execution frequency is 8085/(256 * 256 * 256), which is roughly 0.00048.

5 Experimental Results 

In this section, we report some experimental results about our tool. The
experiments are performed on a workstation with a 3.40GHz Intel Core i7-2600 CPU and
8GB memory. In all experiments, the parameter $S_{max}$ is set to 1600, and $S_{min}$
is set to 40. The domain of each numeric variable is set to [-128..127] by default.


Benchmarks The following instances have been tested. 

- Instances generated from static program analysis. We analyzed the following 
programs: 

• abs: a function which calculates absolute value; 

• findmiddle: a function which finds the middle number among 3 numbers; 

• Space_manage: a program related to space technology; 

• tritype: a program which determines the type of a triangle; 

• calDate: a function which converts the special date into a Julian date; 

- Instances from SMT-Lib, including: (1) the QF_LRA benchmarks Arthan,
atan; and (2) the QF_LIA benchmarks bignum, simplebitadder, fischer,
pigeon-hole, prime_cone.

- Random instances ran_i_c_d, which have d numeric variables, i inequalities
and c clauses. They are generated by randomly choosing coefficients of LACs
and literals of clauses. The length of each clause is between 3 and 5.

In the following tables, “—” means that the instance takes more than one 
hour to solve (or the tool runs out of memory). 

Table 1 shows the results of the comparison between volume estimation and
computation. “Dims” represents the number of numeric variables in LACs. “Ineqs”
represents the number of LACs, which is also the number of Boolean
variables in the Boolean skeleton. “Clauses” represents the number of clauses in
the Boolean skeleton. “Bunches” represents the number of feasible partial
assignments obtained by VolCE, which is also the number of times VolCE calls
PolyVest or Vinci.

Observe that VolCE with PolyVest is very efficient and the relative errors of 
approximation are small. When the dimension of instance grows to 8 or larger, 
VolCE with Vinci often fails to give an answer in one hour or depletes memory. 
Though “Vinci” has an option to restrict its memory storage, as a tradeoff it 
will take much more time to solve, and still cannot solve instances within the 
timeout. 

Table 1. Comparison between Volume Estimation and Computation

                                                               PolyVest           Vinci
Instance                        Dims  Ineqs  Clauses  Bunches  Result    Time(s)  Result    Time(s)
abs_0                              1      1        1        1  127         0.000  127         0.000
findmiddle_2                       3      6       20        2  5369040     0.016  5527125     0.000
findmiddle_3                       3      6       20        7  5696270     0.020  5527125     0.004
Space_manage_1                    17      3        7        1  2.35e+39      2.6  —               —
Arthan1A-chunk-0015                4      6       16        1  2.64e-3     0.016  2.57e-3     0.002
atan-problem-2-weak-chunk-0200     3      6       19        1  2.71        0.008  2.67        0.014
ran_15_45_7                        7     15       45      113  1.84e+15      1.4  1.84e+15       12
ran_20_60_7                        7     20       60      254  6.68e+14     2.75  6.74e+14       84
ran_30_90_7                        7     30       90      401  4.62e+13      5.9  4.58e+13      802
ran_15_45_8                        8     15       45      214  3.58e+17      2.9  3.50e+17       72
ran_20_60_8                        8     20       60      480  1.07e+17      6.3  1.09e+17      259
ran_30_90_8                        8     30       90     1135  6.73e+16     20.7  —               —
ran_20_60_9                        9     20       60      425  1.20e+19      8.3  —               —
ran_20_50_9                        9     20       50      691  2.86e+19     11.6  —               —
ran_20_60_10                      10     20       60      949  2.51e+22     20.3  —               —


Table 2 shows the results of our tool with the two-round strategy on random
instances. Column “Sample” represents the average coefficient of the sample
size $S_i$. In the original algorithm, this average value always equals $S =
S_{max} = 1600$. Values in column “Ratio” approximate the fraction of sample
points saved by the two-round strategy. Evidently, the two-round strategy saves
much time without losing much accuracy. The differences between the results of the
original and the two-round strategy are usually less than 5%. Besides, the two-round
strategy saves 93% to 97% of the sample points and more than 90% of the time.
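
The reported “Ratio” values appear to be consistent with one minus the average sampling coefficient divided by the one-round coefficient 1600. The Python sketch below checks this reading on three (Sample, Ratio) pairs from Table 2; the formula is an inference from the reported numbers rather than a definition given by the tool, and the function name is hypothetical.

```python
S_MAX_COEFF = 1600  # sampling coefficient always used by the original algorithm

def saved_ratio(avg_sample_coeff: float, s_max_coeff: float = S_MAX_COEFF) -> float:
    """Fraction of sample points saved relative to always using s_max_coeff."""
    return 1.0 - avg_sample_coeff / s_max_coeff

for sample in (101.7, 75.5, 54.7):
    print(f"{sample:6.1f} -> {100 * saved_ratio(sample):.1f}%")
```

This prints 93.6%, 95.3% and 96.6%, matching the ratios reported for the three 8-dimensional instances.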


Table 2. Benefits of the Two-Round Strategy

                             Original           Two-Round
 Dim.  Ineq.  Clause  Bunch  Result    Time(s)  Result    Time(s)  Sample  Ratio
    8     15      45    214  3.53e+17       31  3.58e+17      2.9   101.7  93.6%
    8     20      60    480  1.10e+17     78.9  1.07e+17      6.3    75.5  95.3%
    8     30      90   1135  6.94e+16    210.4  6.73e+16     20.7    54.7  96.6%
   10     15      45    228  1.20e+23     68.5  1.20e+23      4.7    63.7  96.0%
   10     20      60    949  2.52e+22    312.8  2.51e+22     20.3    61.5  96.2%
   10     30      90   1394  8.06e+18    524.6  7.92e+18     39.1    66.0  95.9%
   15     40     200   1710  7.93e+27   2958.5  7.94e+27    189.1    44.6  97.2%
   15     50     250    495  1.22e+23    984.4  1.23e+23     67.1    48.3  97.0%
   20     40     200   8095  —               —  2.08e+40     2283    41.3  97.4%
   20     60     400    689  —               —  6.71e+32      285    48.8  97.0%
   30     60     400    886  —               —  6.59e+54     1528    44.0  97.3%
   40     80     550    451  —               —  6.12e+66     2806    43.5  97.3%


Table 3 shows the results of experiments with some larger random instances.
Note that the numbers of dimensions and bunches are the key parameters of the
scale of instances. The larger the number of dimensions or bunches, the more
time the tool needs. VolCE can handle instances with around 30 dimensions in
reasonable time, and up to 50 dimensions when there are only a few bunches.


Table 3. Experiments on Larger Instances

 Dims  Ineqs  Clauses  Bunches  Result    Time(s)
   30     40      200     5200  1.42e+64     4314
   30     60      400      886  6.59e+54     1528
   30     80      500      610  2.01e+41     1227
   40     60      400     1752  5.23e+80     6437
   40     80      550      451  6.12e+66     2806
   40    100      750      111  5.00e+63      916
   50    100      750       24  5.63e+74      650

Table 4 shows the results of counting integer solutions with LattE. LattE can
handle some problems with up to 17 dimensions. However, LattE cannot
solve the random instance “ran_15_45_7”, whose inequalities are
quite complicated.


Table 4. Experiments about counting integer solutions

Instance           Dims  Ineqs  Bunches  Result    Time(s)
abs_0                 1      1        1  128         0.000
findmiddle_2          3      6        2  5527040     0.004
tritype_16            4     18        4  8323072     0.051
calDate_13            6      5        1  7.99e+11    0.028
Space_manage_1       17      3        1  2.69e+39    170.3
bignum_lia1           6     13        0  0           0.028
bignum_lia2           6     13        1  0           0.036
SIMPLEBITADDER_2     12     51       98  0             1.3
FISCHER1-1-fair       4     16        1  256         0.020
FISCHER1-2-fair       6     28        0  0           0.004
prime_cone_sat_2      2      5        1  4159        0.004
ran_15_45_7           7     15       45  —               —


6 Related Works 

There was little work on the counting of SMT solutions, until quite recently. 

Fredrikson and Jha [7] relate a set of privacy and confidentiality verification
problems to the so-called model-counting satisfiability problem, and present an
abstract decision procedure for it. They implemented this procedure for
linear-integer arithmetic. Their tool is called countersat. It is not available to us.

Zhou et al. propose a BDD-based search algorithm which reduces the
number of conjunctions. For each conjunction, they propose a Monte-Carlo
integration with a ray-based sampling strategy, which approximates the volume.
Their tool is named RVC. It can handle formulas with up to 18 variables, but
the running time is dozens of minutes.

A different approach is described in [3]. It reduces an approximate version of
#SMT to SMT. The approach does not need to modify existing SMT solvers. It has
been applied to solve a value estimation problem for a certain kind of probabilistic
programs. We do not know how large the benchmarks are, and the quality of the
approximation is not clear.

7 Concluding Remarks 

VolCE is a tool for computing and estimating the volume of the solution space
(or counting the number of solutions), given a formula/constraint which is a
Boolean combination of linear arithmetic inequalities. VolCE is very flexible to
use. For medium sized SMT(LA) formulas, it can provide exact volume
computation results or the exact number of solutions. For larger SMT(LA) formulas, it can
quickly perform volume estimation with high accuracy, due to the use of effective
heuristics. We believe that the tool will be useful in a number of domains, such
as program analysis and probabilistic verification.
A Introduction of VolCE

A.1 What is VolCE?

VolCE is designed for computing or estimating the size of the solution space of
an SMT formula where the theory T is restricted to the linear arithmetic theory.
(SMT stands for Satisfiability Modulo Theories.) The prototype tool presented
in [13] computes the exact volume of the solution space. However, exact volume
computation in general is an extremely difficult problem. It has been proved
to be #P-hard, even for explicitly described polytopes. On the other hand, it
suffices to have an approximate value of the volume in many cases. Later we
implemented a tool to estimate the volume of polytopes [5] and integrated it
into the framework of [13]. The new tool is called VolCE. It can efficiently handle
instances of dozens of dimensions with high accuracy. In addition, VolCE also
accepts constraints involving independent Boolean variables.

A.2 What can VolCE do? 

VolCE uses the following three packages:

- PolyVest [5], which can be used to estimate the volume of polytopes.

- Vinci, a software package that implements several algorithms for (exact)
volume computation.

- LattE (Lattice point Enumeration), a software package dedicated to the
problems of counting lattice points and integration inside convex polytopes.


B Installing VolCE 

- Step 1: Make sure that g++ (version 4.8 or higher) is installed on
your machine (you can type “g++ -v” to check this).

- Step 2: The functionality of VolCE depends on some other libraries:
zlib, boost, lpsolve, glpk, gfortran, LAPACK, BLAS and Armadillo. On
Ubuntu or Debian, you can use “apt-cache search” and “apt-get install” to
find and install all of these libraries. Or you can download these libraries
from:


Library    URL
zlib       http://zlib.net/
boost      http://www.boost.org/
lpsolve    http://sourceforge.net/projects/lpsolve/
glpk       http://www.gnu.org/software/glpk/
gfortran   http://gcc.gnu.org/fortran/
LAPACK     http://www.netlib.org/lapack/
BLAS       http://www.netlib.org/blas/index.html
Armadillo  http://arma.sourceforge.net/


Note. For lpsolve, make sure that its header files and the dynamic library
(i.e., liblpsolve55.so) are included in the directories “/usr/include/lpsolve”
and “/usr/lib/lp_solve”, respectively. Besides, LAPACK and BLAS should
be installed before installing Armadillo.

- Step 3: Open a shell (command line), change into the directory that was 
created by unpacking the VolCE archive, and type: 

sh build.sh 

When the build process is finished, you will find all binaries in the directory 
“release/”. 

- Step 4: Install LattE. Build and move the executable files (count and
scdd_gmp) into the directory “release/bin/”.

This release of VolCE has been successfully built on the following operating
systems:

- Ubuntu 12.04 on 64-bit with g++ 4.8.1

- Ubuntu 13.10 on 32-bit with g++ 4.8.1

C Input Format 

The input of VolCE is an SMT formula where the theory T is restricted to
the linear arithmetic theory. It involves variables of various types (including
integers, reals and Booleans). We usually use $b_i$ to denote Boolean variables and $x_j$
to denote numeric variables. In the input formula, there can be logical operators
(like AND, OR, NOT), arithmetic operators (like addition, subtraction, scalar
multiplication) and comparison operators (like <, ≤, >, ≥, =, ≠).

Syntactically, there are two formats for the input file:

- VolCE style
- SMT-LIBv2

We now describe them in detail.


C.1 VolCE style Input Format

Let us introduce some concepts first. 

- LAC: A linear arithmetic constraint (LAC) is a comparison between two
linear arithmetic expressions. Such a constraint can be denoted by a Boolean
variable (e.g., $b_3 \equiv x_1 + x_2 < 1$).

- literal: A literal is either a Boolean variable (e.g., $b_3$) or a negated Boolean
variable (e.g., NOT $b_3$).

- clause: A clause is a set of one or more literals, connected with OR. (Boolean
variables may not be repeated inside a clause.)

- formula: A formula is a set of one or more clauses, connected with AND.

It is well known that any Boolean expression can be converted into
conjunctive normal form (CNF) easily (e.g., using the Tseitin transformation [16]).
So the input of VolCE is a formula in CNF, where each Boolean variable
may stand for some LAC. The input file generally consists of two parts: LACs,
and clauses in the CNF.

An example of a VolCE style formula is the following:

b1 ≡ x1 < x2,
b3 ≡ x1 + x2 < 1,
b4 ≡ x1 ≤ 1,
b5 ≡ x2 ≤ 1,
b6 ≡ x1 ≥ 0,
b7 ≡ x2 ≥ 0,

(b1 OR (NOT b3)) AND
(b1 OR (NOT b2) OR b3) AND
((NOT b1) OR b3) AND
b4 AND b5 AND b6 AND b7.        (1)

There are 7 Boolean variables (b1, ..., b7), 2 numeric variables (x1 and x2),
6 LACs and 7 clauses in Formula (1). Note that b2 is an independent Boolean
variable which does not represent any LAC.

Syntactically, VolCE accepts input in an “Enhanced DIMACS CNF Format”.
Every line beginning with “c” is a comment. The first non-comment line must
be of the form:


p cnf vlc BOOLS CLAUSES NUMVARS LACS
It specifies the number of Boolean variables, the number of clauses, the number
of numeric variables and the number of linear constraints.

Every line beginning with “m” defines a linear constraint and its
corresponding Boolean variable. It must be of the form:

m i a1 ... an op b

It defines a linear inequality $a_1 x_1 + \cdots + a_n x_n$ op $b$, where $a_1, \ldots, a_n, b$ are
constants, and op is a comparison operator: <, <=, >, >= or =. (The tool does
not support ≠ directly. However, ax ≠ b is equivalent to NOT (ax = b).) The number i
means that the Boolean variable $b_i$ represents this inequality. The space between the
character “m” and the number i is not mandatory.

Each of the other lines defines a clause: a positive literal is denoted by the
corresponding number (so 4 means $b_4$), and a negative literal is denoted by the
corresponding negative number (so -5 means NOT $b_5$). The last number in the
line should be zero. Each of these lines is a space-separated list of numbers.

So the above Formula (1) would be written in the following way:

c It is an example, f1.vs.
p cnf vlc 7 7 2 6
c Linear Constraints part.
m1 1 -1 < 0
m3 1 1 < 1
m4 1 0 <= 1
m5 0 1 <= 1
m6 1 0 >= 0
m7 0 1 >= 0
c CNF part.
1 -3 0
1 -2 3 0
-1 3 0
4 0
5 0
6 0
7 0

See the file examples/f1.vs.
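
To illustrate how the three kinds of lines fit together, here is a minimal Python sketch of a reader for this format. It is not the parser used by VolCE: the function name and the returned data layout are hypothetical, and the header is handled by reading only its last four integer fields, so the sketch does not depend on the exact keyword that follows “p cnf”.

```python
from typing import Dict, List, Optional, Tuple

LAC = Tuple[List[float], str, float]  # (coefficients, operator, right-hand side)

F1_TEXT = """\
c It is an example, f1.vs.
p cnf vlc 7 7 2 6
c Linear Constraints part.
m1 1 -1 < 0
m3 1 1 < 1
m4 1 0 <= 1
m5 0 1 <= 1
m 6 1 0 >= 0
m7 0 1 >= 0
c CNF part.
1 -3 0
1 -2 3 0
-1 3 0
4 0
5 0
6 0
7 0
"""

def parse_volce_style(text: str):
    """Parse the enhanced DIMACS CNF text described above (illustrative only)."""
    header: Optional[Tuple[int, ...]] = None
    lacs: Dict[int, LAC] = {}
    clauses: List[List[int]] = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("c"):
            continue  # blank line or comment
        if line.startswith("p"):
            # The last four fields are BOOLS CLAUSES NUMVARS LACS.
            header = tuple(int(tok) for tok in line.split()[-4:])
        elif line.startswith("m"):
            tokens = line[1:].split()  # the space after "m" is optional
            index = int(tokens[0])
            coeffs = [float(tok) for tok in tokens[1:-2]]
            lacs[index] = (coeffs, tokens[-2], float(tokens[-1]))
        else:
            literals = [int(tok) for tok in line.split()]
            if literals[-1] != 0:
                raise ValueError("a clause line must end with 0")
            clauses.append(literals[:-1])
    return header, lacs, clauses

header, lacs, clauses = parse_volce_style(F1_TEXT)
print(header, len(lacs), len(clauses))
```

On the example file this yields the header (7, 7, 2, 6), six LACs indexed by their Boolean variables, and seven clauses; variable 2 has no LAC and is therefore an independent Boolean variable.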

C.2 The SMT-LIBv2 Language Inputs 

VolCE also partially supports the SMT-LIBv2 language. For details of this
language, visit the website:

http://www.smt-lib.org/

VolCE recognizes the SMT-LIBv2 format from the file name extension “.smt2”.

It automatically parses such a file into the VolCE style input.
Table 5 lists the commands, variable types and identifiers of the SMT-LIBv2
language that are supported by VolCE. VolCE ignores some basic commands like
set-logic, set-info, check-sat, exit. It directly checks all of the assertions.
Besides, assert commands must be written after all the declare-fun
commands.


Table 5. Supported SMT-LIBv2 Components

Commands        declare-fun assert
Variable Types  Int Real Bool
Identifiers     let
                and or not => ite
                + - * /
                = > >= < <= distinct


In the SMT-LIBv2 language, the above Formula (1) would be written like this:

(set-logic QF_LRA)
(set-info :f1.smt2)
(set-info :smt-lib-version 2.0)
(set-info :status sat)
(declare-fun x () Real)
(declare-fun y () Real)
(declare-fun b () Bool)
(assert (and (<= x 1) (<= y 1) (>= x 0) (>= y 0)))
(assert (let ((v1 (< (+ x y) 1)) (v2 (< x y)))
(and (or v1 (not v2)) (or v1 v2 b) (or (not v1) v2))))
(check-sat)
(exit)

See the file examples/f1.smt2.

D Running VolCE 

To run VolCE, you should switch your working directory to the directory of
VolCE.

VolCE has a help menu. To view it, simply type the command "./volce --help".
The general usage of VolCE is

% ./volce [OPTION]... <INPUT-FILE>

The meanings of the options are given in the following table.

Table 6: Command-line Options of VolCE

Option          Meaning

-P              Enables PolyVest for volume approximation. The input
                variables in the linear inequalities are reals. This is
                the default: VolCE calls PolyVest and assumes that all
                the numeric variables are reals.

-V              Enables Vinci for volume computation. The input
                variables in the linear inequalities are reals.

-L              Enables LattE to count the number of integer solutions.
                The input variables in the linear inequalities should be
                integers. This option is usually enabled in the case of
                integer variables.

-w=NUMBER       Specifies the word length $w$ of numeric variables in
                bit-wise representations. Each variable is then
                automatically bounded by the range
                $[-2^{w-1}, 2^{w-1} - 1]$. Setting the word length to 0
                disables this feature. By default, the word length is 8,
                which means the domain of every numeric variable is
                [-128, 127]. For example, you can change it to 3 by
                using the option -w=3.

-maxc=NUMBER    Sets the maximum sampling coefficient of PolyVest, which
                is an upper bound. Generally, the larger this
                coefficient is, the more accurate the result will be,
                at the cost of a longer running time. The default value
                of maxc is 1600. For example, you may change it to 1000
                by using the option -maxc=1000.

-minc=NUMBER    Sets the minimum sampling coefficient of PolyVest. The
                default value is 40.
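The arithmetic behind the -w option can be illustrated with a short sketch. The helper name word_length_bounds is hypothetical and is not part of VolCE; it only reproduces the range implied by a word length, under the assumption that a non-positive word length means "no default bounds".

```python
def word_length_bounds(w):
    """Return (lower, upper) for word length w, or None when bounds are disabled.

    Hypothetical helper, not part of VolCE: it mirrors the documented range
    [-2^(w-1), 2^(w-1) - 1] and treats w <= 0 as "no default bounds".
    """
    if w <= 0:
        return None
    half = 2 ** (w - 1)
    return -half, half - 1


if __name__ == "__main__":
    for w in (8, 4, 3, 2):
        lower, upper = word_length_bounds(w)
        # Number of integer values available to each numeric variable.
        size = upper - lower + 1
        print(f"-w={w}: [{lower}, {upper}] ({size} values per variable)")
```

With the default word length of 8 this gives the domain [-128, 127] stated in the table; a word length of 2 leaves 4 values per variable.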


To estimate the volume of the solution space of the above formula, simply type:

% ./volce examples/f1.vs

Note that the formula guarantees $0 \le x_1, x_2 \le 1$, so we can disable the internal bit-wise bounds of the numeric variables by setting the word length to 0:

% ./volce -w=0 examples/f1.vs

You can also enable PolyVest, Vinci and LattE at the same time:

% ./volce -P -V -L -w=0 examples/f1.vs

Remarks Several tools (PolyVest, Vinci, LattE) have been integrated, and each can be enabled for different situations. Vinci gives an accurate volume for a polytope, but it may have difficulty handling problem instances with more than 10 numeric variables. PolyVest gives approximate results, but it can deal with larger instances. LattE is good at counting the number of integer solutions. Sometimes, the first two tools can also be used for approximating the number of integer solutions.

E Examples 

Example 1 For the above formula we have two input files: the VolCE style input (f1.vs) and the SMT-LIBv2 input (f1.smt2).

Execute the command:

% ./volce -P -V -L -w=0 examples/f1.smt2

And we obtain the result:

Enabled PolyVest.
Enabled Vinci.
Enabled LattE.
Set word length to 0.
Disabled default bounds since word length <= 0.
VolCE Directory: ...
Working Directory: ...

Parsing smt2 file.
Reading Input.
Number of bool vars: 16
Number of clauses: 29
Number of numeric vars: 2
Number of linear constraints: 6
================================
Branches: 2
SATISFIABLE

================================
=========== PolyVest ===========
================================
FIRST ROUND
0 0.222875 * 2
1 0.24338 * 1
SEC & LAST ROUND
0 1600 0.252037 * 2
1 1600 0.250964 * 1
Total approximation: 0.755039
================================
The total volume (PolyVest): 0.75503900

============ Vinci =============
================================
0.25000000 * 2
0.25000000 * 1
================================
The total volume (Vinci): 0.75000000

============ LattE =============
================================
0 * 2
2 * 1
================================
The total volume (LattE): 2

Analysis: 

Figure 4 shows the linear constraints in the formula. The plane is split into 4 areas. Since $b_4$, $b_5$, $b_6$ and $b_7$ are always True, the pair $\{b_1, b_3\}$ determines the counted areas.

- Area I: {$b_1$ = True, $b_3$ = True}. It has no lattice points.

- Area II: {$b_1$ = True, $b_3$ = False}. It has 1 lattice point: {0, 1}.

- Area III: {$b_1$ = False, $b_3$ = False}. It has 2 lattice points: {1, 0} and {1, 1}.

- Area IV: {$b_1$ = False, $b_3$ = True}. It has 1 lattice point: {0, 0}.

Fig. 4. Solution Space of Formula

There are 3 Boolean solutions for the formula: {$b_1$ = True, $b_2$ = True, $b_3$ = True}, {$b_1$ = True, $b_2$ = False, $b_3$ = True}, and {$b_1$ = False, $b_2$ = True, $b_3$ = False}. Thus the volume of the solution space is $2 \times vol(Area_{I}) + vol(Area_{III}) = 0.75$. And there are $2 \times 0 + 2 = 2$ integer solutions (lattice points).
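These two numbers can be cross-checked independently of VolCE. The following is a minimal sketch, assuming the formula is exactly the one given in the SMT-LIBv2 listing of Appendix C.2 (variables x, y in [0, 1] and one free Boolean b). It enumerates the lattice points by brute force and estimates the volume by plain Monte Carlo sampling, which is a stand-in for illustration and not the method used by PolyVest or Vinci. The function names are hypothetical.

```python
import random


def satisfied(x, y, b):
    """Evaluate the example formula for numeric values x, y and Boolean b."""
    v1 = x + y < 1
    v2 = x < y
    return (v1 or not v2) and (v1 or v2 or b) and (not v1 or v2)


def count_integer_solutions():
    """Count (x, y, b) solutions with x, y restricted to the lattice points of [0, 1]^2."""
    return sum(
        satisfied(x, y, b)
        for x in (0, 1)
        for y in (0, 1)
        for b in (False, True)
    )


def estimate_volume(samples=200_000, seed=0):
    """Monte Carlo estimate: average number of Boolean values of b that satisfy
    the formula at a uniformly sampled point of the unit square."""
    rng = random.Random(seed)
    total = 0
    for _ in range(samples):
        x, y = rng.random(), rng.random()
        total += satisfied(x, y, False) + satisfied(x, y, True)
    return total / samples


if __name__ == "__main__":
    print("integer solutions:", count_integer_solutions())
    print("estimated volume:", estimate_volume())
```

The integer count is exact. The volume estimate fluctuates with the seed and sample size, and should only be read as a rough agreement with the value 0.75 reported by Vinci.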

Example 2 Here is an exercise for young pupils: In the following square, there are 
8 sub-areas. Color them so that the neighboring sub-areas use different colors. 
How many different coloring schemes are there? 


[Figure: a square divided into 8 sub-areas labeled A, B, C, D, E, F, G and H]


Obviously, this problem can be regarded as a solution counting problem. The 
input consists of the following inequalities: 

xA ≠ xB, xA ≠ xD, xB ≠ xC, xB ≠ xD, xB ≠ xE,
xC ≠ xE, xD ≠ xE, xD ≠ xF, xD ≠ xG,
xE ≠ xG, xE ≠ xH, xF ≠ xG, xG ≠ xH.

We assume that there are at most 4 colors, and execute the following command: 
% ./volce -L -w=2 examples/coloring.smt2
We find that there are 768 solutions. 
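The count can be reproduced with a brute-force enumeration, since 8 regions with 4 colors give only 4^8 = 65536 candidate assignments. The sketch below is an illustration, not part of VolCE. It assumes the 13 disequalities relate the pairs listed in EDGES, reading the seventh constraint as a disequality between xD and xE, which is the reading under which the enumeration agrees with the reported 768.

```python
from itertools import product

REGIONS = "ABCDEFGH"

# Pairs of neighboring sub-areas (assumption: the seventh pair is D-E).
EDGES = [
    ("A", "B"), ("A", "D"), ("B", "C"), ("B", "D"), ("B", "E"),
    ("C", "E"), ("D", "E"), ("D", "F"), ("D", "G"),
    ("E", "G"), ("E", "H"), ("F", "G"), ("G", "H"),
]


def count_colorings(num_colors=4):
    """Count assignments of colors to regions in which neighbors differ."""
    index = {region: i for i, region in enumerate(REGIONS)}
    count = 0
    for colors in product(range(num_colors), repeat=len(REGIONS)):
        if all(colors[index[u]] != colors[index[v]] for u, v in EDGES):
            count += 1
    return count


if __name__ == "__main__":
    print(count_colorings(4))
```

The option -w=2 plays the role of num_colors here: a word length of 2 bounds each variable to [-2, 1], which contains exactly 4 integers.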

Example 3 In [13], we describe a program called getop() and analyze the execution frequency of its paths. For Path1, its path condition is:

(NOT ((c = 32) OR (c = 9) OR (c = 10))) AND 
((c != 46) AND ((c < 48) OR (c > 57))) 

Here c is a variable of type char; it can be regarded as an integer variable within 
the domain [-128..127]. 

For the above path condition, we can compute the number of solutions by 
executing the command: 

% ./volce -L examples/program_analysis/getopPath1.smt2

We find that the path condition has 242 solutions. (We do not need to use the 
option -w=8, because the default word length is 8.) 

Given that the size of the whole search space is 256, we conclude that the frequency of executing Path1 is about 0.945 (i.e., 242/256). This means that, if the input string has only one character, the program will most probably follow this path.
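Because c ranges over only 256 values, the count for Path1 is easy to confirm by direct enumeration. The sketch below is an independent check that transcribes the path condition above; it does not call VolCE or LattE, and the function name path1 is hypothetical.

```python
def path1(c):
    """Path condition of Path1, transcribed from the text."""
    return (not (c == 32 or c == 9 or c == 10)) and (
        c != 46 and (c < 48 or c > 57)
    )


# A char is treated as an integer in [-128, 127] (word length 8).
DOMAIN = range(-128, 128)

path1_count = sum(path1(c) for c in DOMAIN)
path1_frequency = path1_count / len(DOMAIN)

if __name__ == "__main__":
    print(path1_count, path1_frequency)
```

The 14 excluded values are the three whitespace characters (32, 9, 10), the dot (46) and the ten digits (48 to 57).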

Another path, Path2, has the following path condition: 

((c0 = 32) OR (c0 = 9) OR (c0 = 10)) AND
(NOT ((c1 = 32) OR (c1 = 9) OR (c1 = 10))) AND
(NOT ((c1 != 46) AND ((c1 < 48) OR (c1 > 57)))) AND
(NOT ((c2 >= 48) AND (c2 <= 57))) AND 
(NOT (c2 = 46)) 

Given this set of constraints, and using LattE, our tool tells us that the number 
of solutions is 8085. The executed command is: 

% ./volce -L examples/program_analysis/getopPath2.smt2

So the path execution frequency is 8085/(256*256*256) which is roughly 0.00048. 
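The Path2 count can also be checked by hand or with a few lines of code. Each conjunct of the path condition mentions only one of c0, c1, c2, so the number of solutions is the product of three one-variable counts; this avoids enumerating all 256^3 triples. The sketch below is an independent check under that observation, not a description of how LattE computes the result, and the helper names are hypothetical.

```python
# Each char variable is treated as an integer in [-128, 127].
DOMAIN = range(-128, 128)


def is_blank(c):
    """Space, tab or newline."""
    return c == 32 or c == 9 or c == 10


def is_digit_or_dot(c):
    """Negation of ((c != 46) AND ((c < 48) OR (c > 57)))."""
    return not (c != 46 and (c < 48 or c > 57))


# Constraints on c0, c1 and c2 are independent, so count them separately.
n_c0 = sum(is_blank(c) for c in DOMAIN)
n_c1 = sum((not is_blank(c)) and is_digit_or_dot(c) for c in DOMAIN)
n_c2 = sum((not (48 <= c <= 57)) and c != 46 for c in DOMAIN)

path2_count = n_c0 * n_c1 * n_c2
path2_frequency = path2_count / len(DOMAIN) ** 3

if __name__ == "__main__":
    print(n_c0, n_c1, n_c2, path2_count, path2_frequency)
```

The three factors are 3, 11 and 245, whose product is the 8085 solutions reported above.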

Example 4 Hoare’s program FIND takes an array A[N] and an integer as input, 
and partitions the array into two parts. 

Assume that N = 8. We may extract two execution paths from the program, 
and generate the path conditions. The first path condition is the following: 

(A[0] < A[3]); !(A[1] < A[3]); (A[3] < A[7]);
!(A[3] < A[6]); !(A[2] < A[3]); !(A[3] < A[5]);
!(A[3] < A[4]); (A[0] < A[4]); (A[6] < A[4]); (A[5] < A[4]).

Note: The path condition is a set of constraints such that any input data satisfying these constraints will make the program execute along that path.

Setting the word length to 4, we find that the number of solutions is 4075920. The executed command is:

% ./volce -L -w=4 examples/program_analysis/FINDpath1.smt2

The second path condition is a bit more complicated: 

!(A[0] < A[3]); (A[3] < A[7]); (A[3] < A[6]);
(A[3] < A[5]); (A[3] < A[4]); !(A[1] < A[3]);
(A[3] < A[2]); (A[3] < A[1]); (A[1] < A[0]);
(A[2] < A[0]); !(A[0] < A[7]); !(A[4] < A[0]);
(A[0] < A[6]); !(A[0] < A[5]); (A[1] < A[7]);
(A[2] < A[7]); !(A[7] < A[5]); !(A[1] < A[5]);
!(A[2] < A[5]); (A[5] < A[2]); (A[2] < A[1]).

Executing the command: 

% ./volce -L -w=4 examples/program_analysis/FINDpath2.smt2

we find that the number of solutions is 87516. So, the first path is executed much more frequently than the second one. (We assume that the inputs are uniformly distributed over the input space.)
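To put numbers on "much more frequently", the two counts can be turned into frequencies. The sketch below assumes that the input space consists of the 8 array elements only, each ranging over the 16 values allowed by word length 4; this is an assumption made for illustration, and the variable names are hypothetical.

```python
WORD_LENGTH = 4      # option -w=4, i.e. each element lies in [-8, 7]
ARRAY_SIZE = 8       # N = 8

# Size of the assumed input space: 16 values for each of the 8 elements.
input_space = (2 ** WORD_LENGTH) ** ARRAY_SIZE

# Solution counts reported in the text for the two path conditions.
solution_counts = {"path1": 4075920, "path2": 87516}

# Frequency of each path under a uniform distribution over the input space.
frequencies = {name: n / input_space for name, n in solution_counts.items()}
ratio = solution_counts["path1"] / solution_counts["path2"]

if __name__ == "__main__":
    for name, freq in frequencies.items():
        print(f"{name}: {freq:.3e}")
    print(f"path1 / path2: {ratio:.1f}")
```

Under these assumptions both paths are rare in absolute terms, and the first path is taken roughly 46 to 47 times as often as the second.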

Example 5 Let us try a randomly generated example (ran_5_20_8.vs). It has 5 Boolean variables, 8 numeric variables, 20 clauses and 5 linear constraints.

Execute the command:

% ./volce -P -V -w=4 examples/ran/ran_5_20_8.vs

And we obtain the result:

Enabled PolyVest.
Enabled Vinci.
Set word length to 4.
VolCE Directory: ...
Working Directory: ...
================================

Reading Input.
Number of bool vars: 5
Number of clauses: 20
Number of numeric vars: 8
Number of linear constraints: 5

Branches: 1
SATISFIABLE

================================
=========== PolyVest ===========
================================
FIRST ROUND
0 8.38972e+06 * 1
SEC & LAST ROUND
0 1600 7.88093e+06 * 1
Total approximation: 7.88093e+06
The total volume (PolyVest): 7880930.00000000

============ Vinci =============
================================
7970738.22355500 * 1
The total volume (Vinci): 7970738.22355500













